ReefCloud

Coral cover changes at large spatial scales

Authors
Affiliations

Julie Vercelloni

Australian Institute of Marine Science

Murray Logan

Australian Institute of Marine Science

Manuel Gonzalez-Rivero

Australian Institute of Marine Science

Kerrie Mengersen

Queensland University of Technology

1 Introduction

This document complements information displays on the ReefCloud Dashboard (https://reefcloud.ai/). It contains a description about the purposes of the spatio-temporal model adopted by ReefCloud to predict coral cover changes across spatial scales and how to interpret outputs.

We present the different aspects of the modelling framework using a working example with reproducible R codes available at https://github.com/ReefCloud.

2 Methods

2.1 Coral data

The working example uses observations of coral cover from the ongoing Long Term Monitoring Program (LTMP) (Emslie et al. 2020). In the region of the working example, the LTMP surveys seven coral reefs with three sites of five permanently marked transects at each reef. Observations from 2004 to 2022 at the transect scale are used in the model (Figure 1). These observations are derived from the outputs of the machine learning algorithms from the ReefCloud platform.

Figure 1: Coral cover trajectories at transect level within site.

2.2 Spatial scales of aggregation

To predict coral cover changes across large spatial scales, ReefCloud uses four different spatial scales available globally:

  • tier2 - country level
  • tier3 - levels of management regions
  • tier4 - levels of marine ecoregions of the world (Spalding et al. 2007)
  • tier5 - 5x5 km hexagonal grid

In the ReefCloud framework, coral cover observations can be extracted using levels from tier2, tier3 or tier4. In our working example, coral cover data are extracted within the limits of each tier4 level (Figure 2) due to the size of the country. Indeed, the area of the unit 1808 corresponds roughly to the size of Spain (14 millions hectares). This tier4 unit is composed of 1110 tier5 cells and 33 of them contain long-term observations in coral cover (Figure 3). Additional information within each tier5 cell includes the area occupied by coral reefs and unique reef identifier that were derived from the Reefs at Risk Revisited Project (Burke et al. (2011)).

Figure 2: Locations of tier4 units within Australia. The blue unit indicate the location of the working example.
Figure 3: Locations of tier5 within the tier4 “1808” area. The blue cells indicate the tier5 containing long-term observations in coral cover.

2.3 Disturbance data

Cyclone exposure

The spatial distribution of likely cyclone impacts was estimated using the 4MW model (Puotinen et al. 2016), which predicts the duration of damaging waves from cyclones by year. It is based on the duration and speed of modelled cyclonic winds, and estimates on the tier5 grid. The model defines damaging waves as the average of the highest one-third of wave heights over a sustained period of high winds, and which are four meters in amplitude or greater.

In the region of the working example, we detected increasing wave intensity in the years 2017 and 2018 associated with the cyclones Debbie that hit the region in 2017 at category 4 level and category 2 Iris in 2018 (Figure 4).

Figure 4: Exposure to damaging waves in hours used as a measure of cyclone impact within each tier5 cell of the tier4 area.

Heat stress

Degree Heating Weeks (DHW) was used to estimate heat stress impacts potentially leading to mass coral bleaching. DHW is derived from the Coral Bleaching HotSpot product that provides an instantaneous estimate of heat stress at a spatial resolution of 5 km over a 12-week window (Liu et al. 2018). The maximum DHW per year was used as indicator of heat stress event in the model. The year associated with DHW is modified when the maximum DHW occurred after field observations of the same year. In this case, associated DHW value is used for the next year.

In the region of the working example, heat stress events are detected for the years 2016, 2017 and 2020 (Figure 5).

Figure 5: Exposure to heat stress in degree heating weeks (DHW) used as a measure of bleaching impact within each tier5 cell of the tier4 area.

2.4 QA/QC

Prior analyses found a pervasive model behavior in the presence of extreme disturbance events in locations without data. To ensure quality and accuracy in model predictions, values of cyclone exposure and heat stress were not accounted (values replaced by NAs) when two following conditions applied:

  • values greater than the 97.5% quantiles of the tier4 distribution
  • absence of data in the corresponding tier5

For the region of the working example, values greater than 28 hours of exposure to cyclonic waves and 8 DHWs were considered out of the distribution. In 2017, values of cyclone exposure were not accounted for 50.3% of cells (n = 559) and 6.7% (n = 75) in 2018 (Figure 6). Some values of heat stress were not accounted for the years 2016, 2017 and 2020, with a maximum of 35.5% of cells (n = 394) in 2017 (Figure 7).

Figure 6: Locations of cells where values of cyclone exposure were replaced by NA for the corresponding year.
Figure 7: Locations of cells where values of cyclone exposure were replaced by NA for the corresponding year.

2.5 Spatio-temporal model

Introduction to the model

A spatio temporal model is developed to fit the following purposes:

  • predict changes in coral cover at large spatial scales
  • propagate uncertainty associated with predictions across spatial scales, appropriately
  • estimates of drivers of coral cover change # consider for the fine-scale coral cover dynamics from monitoring data

We addressed these criteria by modelling coral dynamics in space and time using a fixed-rank approach suitable for big data settings and non-Gaussian responses (Zammit-Mangion and Cressie 2021). The model uses spatio-temporal random effects that are based on an automatic placement of spatio-temporal basis functions constructed via a tensor product of spatial and temporal basis functions (Figure 8). These effects are spatially correlated to introduce a dependence (defined as spatial auto-correlation) between locations across three resolutions based on discrete global grids (Sahr 2008). Spatio-temporal random effects allow to interpolate values of coral cover at unobserved tier5 cells and thus increase the volume of information to interpret and scale-up predictions and uncertainty in an appropriate way. In this working example, the spatial component is composed of 120 functions. The temporal component uses 10 knots situated every 2.33 years years with a scaling factor of 2.1 (see (Zammit-Mangion and Cressie 2021) for additional information on the basis functions).

Figure 8: Spatial (left) and temporal (right) basis functions used in the model.

Coral dynamics is also estimated while considering for the effects of heat stress events and cyclones at lag 0, 1 and 2 and reef level variations. Trajectories of coral cover are scaled-up by weighting posterior distributions by reef area within each tier5 cell and summing them by year. The posterior median coral cover and associated uncertainty are estimated from the 95% highest posterior density intervals of their respective distributions (Kay 2023) based on 400 draws. In the ReefCloud pipeline, this process is repeated across all tier5 cells within tier4, tier3 and tier2 to produce coral cover trajectory across large spatial scales. In this working example, the spatio-temporal model is implemented using coral cover observations from a unique tier4 unit.

The spatio-temporal hierarchical generalized linear mixed model consists of two conditional-probability layers (Sainsbury-Dale, Zammit-Mangion, and Cressie 2021). In the process layer, coral dynamics is modelled according to different types of spatio-temporal variability attributed to disturbances, random variations that occurred at the medium spatial scales (\(\mathrm{v}\)) and other independent variations at fine scale (\(\mathrm{\zeta}\)) and reef scale (\(\mathrm{V}\)). In the data layer, the nature of the data is used to determine the probability distribution of the model. Here, we use a binomial distribution with coral cover expressed as counts per tier5 cell. Indeed, ReefCloud methodology uses a sample of 50 points within each image, so that abundances of corals \(y_{it}\) for observation \(i\) sampled at location \(\boldsymbol{s}\) and time \(t\) are integers between 0 and 50. The binomial distribution is a function of a mean latent process, \(\mu_{it}\) that are then averaged across tier5 cells by year and linked to the process layer via an inverse-logit transformation. This step facilitates the spatial change-of-support between observations in coral cover at the transect level and model predictions at the tier5 level.

Model equations

\[ \begin{align} {y_{it}} &\overset{\rm ind}{\sim} {\sf{Bin}}\bigg({\rm logit}^{-1}(\rm{Y}(\boldsymbol{s}))\bigg) , \quad \boldsymbol{s} \in \rm{Tier4}\\ \rm{Y}(\boldsymbol{s}) &= t(\boldsymbol{s})^\intercal\alpha + v(\boldsymbol{s}) +\zeta(\boldsymbol{s}) + \rm{V}(\boldsymbol{s}),\\ v(s) &= \phi(\boldsymbol{s})^\intercal\eta,\\ \eta &\sim \rm{Gau}(0,\rm{Q}^{-1}),\\ \zeta, \rm{V} &\sim \rm{Gau}(0, \sigma_{.}^{2})\\ \end{align} \tag{1}\]

where \(\rm{Y}(\boldsymbol{s})\) corresponds to the averaged mean latent process with \(\boldsymbol{s}\) the location of tier5 units within a tier4 area. \(t(\boldsymbol{s})^\intercal\) and \(\phi(\boldsymbol{s})^\intercal\) are estimated using known design matrices constructed from spatially referenced covariates and spatio-temporal basis functions, respectively. \(\alpha\) represents fixed effects of each covariate - effects of of stress events and cyclones at lag 0, 1 and 2. The medium spatial scale variation, \(\rm{v}\), is also conditioned on a vector of random coefficients \(\eta\) modelled as a multivariate Gaussian random vector with an inversed covariance kernel, \(\mathrm{Q}^{-1}\). \(\zeta\) and \(\rm{V}\) are independent random effects at the fine-scale and reef level, respectively.

Model implementation

The spatio-temporal model is fitted under an hybrid Bayesian and Frequentist framework using the R package FRK version 2.1.5 (Sainsbury-Dale, Zammit-Mangion, and Cressie 2021). A Monte Carlo approach is used to infer on the averaged mean latent process (\(\rm{Y}(\boldsymbol{s})\)) and obtain a posterior predictive distribution of coral cover for each tier5. Then model parameters and and fixed effects are treated as non-random quantities where only a point estimate is kept by the model, in line with FRK v1 (Zammit-Mangion and Cressie 2021). This framework is key for ReefCloud because it allows to substantially decrease the computational speed of the model.

The best model formulation was selected using visual and statistical diagnostics including model fit, residual patterns using the DHARMaR package (Hartig 2023), basis dimensions and Akaike Information Criterion (AIC) values. In addition, because model predictions are on a different spatial scale than the observations, we developed a leave-out data approach to proceed further evaluations of model performance. We used five prediction-performance measures to assess model predictions with the exclusion of different data (random data and by blocks). Results from these analyses are shown in Section 5.

3 Results

3.1 Tier5

At the tier5 level, predictive values in coral cover and associated uncertainty changed in space and time (Figure 9; Figure 10). It is expected when using spatio-temporal model that the uncertainty remains low at predictive locations with and close to the data and increase further away their locations. In this working example, we observed this with a large uncertainty in the southern section of the predictive area ranging between 60-100% and varying across years (Figure 10). Despite, the model captures well coral cover trajectories in tier5 with data (Figure 11) and surrounding cells (Figure 13). Predictions in coral cover are also available outside the temporal range of observed values as shown for the tier5 “7116” where observations in coral cover are only available for the years 2007-2013 whereas the model predicted changes from 2004 to 2022 (Figure 11).

Predictions

Figure 9: Spatio-temporal predictions of coral cover.

Uncertainty associated with predictions

Figure 10: Uncertainty associated with spatio-temporal predictions of coral cover. Uncertainty is represented using the width of the 95% credible intervals from posterior predictive distributions.

Coral cover trajectories of tier5 with data

Figure 11: The black lines show averaged predictions at the tier5 level and blue ribbons associated 95% credible intervals. The grey lines show observed coral cover trajectories at the transect level.

Coral cover trajectories of tier5 without data

Figure 12: Examples of model fit using random tier5 cells that are close from observations (in red) and far (in green). The blue hexagons show the locations of tier5 with data. Associated coral cover trajectories are shown in the figure below.
Figure 13: Examples of model fit using random tier5 cells that are close from observations (top panel) and far (bottom panel). Black lines show the estimated coral cover and ribbons associated uncertainty. Associated locations are shown in the figure above.

3.2 Tier4

At the tier4 level, coral cover is estimated in square kilometer because the predictions were weighted by the area of coral reef within each tier5 cell. In this working example, predicted coral cover trajectory at the tier4 level remained relatively constant before decreasing in 2012. An increased in coral cover was predicted from 2012-2017 reaching almost 500 sq km before decreasing again in 2019 (Figure 14). We found that cyclone impact at lag1 drove coral cover loss in this region (Figure 15) - 95% credible intervals are negative. Predicted decreased in 2012 could be associated with the impact of category 5 cyclone Yasi that hit the region in February 2011. The model estimates an absolute loss in coral cover of 105 (63-143, 95% CI) sq km for the entire region. Similarly, the decrease between 2018-2019 could be attributed to cyclone Debbie that hit the region at category 4 in March 2017. We estimated that 120 (71-171, 95% CI) sq km of reef area was loss by this disturbance.

Predicted trajectory

Figure 14: The black line shows the estimated coral cover and ribbons associated uncertainty.

Effect of disturbances

Figure 15: Estimated size effects for each covariate.

4 Acknowledgements

The authors would like to thank the research team that contribute to the model and pipeline development. We are grateful for the support of the FRK fathers Andrew Zammit Mangion and Matthew Sainsbury-Dale and their contribution in developing new functions to better match with ReefCloud purposes. We thank Marji Puotinen and Ronen Galaiduk for their work with the disturbance spatial layers, Luz Valerie Pascal to improve the modelling pipeline and Brendan Ford to create the report template. This work was funded by the Australian Institute of Marine Science and the Australian Department of Foreign Affairs and Trade.

5 Appendix

Our model selection analysis compares five formulation of the FRK model combining the inclusion and exclusion of covariates and reef level random effects. The best model formualtion was selected by looking at the Akaike Information Criteria (AIC) values using the minimum value as indicator of best model formulation but also we investigated the range of predicted coral cover values and associated uncertainty, estimated effect of disturbances and predicted coral cover trajectory at tier4 level across model formulations.
We examined five model formulations:

  • Model full: \(\text{Coral cover}\) ~ 1 + \(\text{Heat stress}\) + \(\text{Cyclone exposure}\) + \(\text{Heat stress}_{lag1}\) + \(\text{Heat stress}_{lag2}\) + \(\text{Cyclone exposure}_{lag1}\) + \(\text{Cyclone exposure}_{lag2}\) + (1 | \(\text{reef}\))
  • Model random only: \(\text{Coral cover}\) ~ 1 + (1 | \(\text{reef}\))
  • Model covariates only: \(\text{Coral cover}\) ~ 1 + \(\text{Heat stress}\) + \(\text{Cyclone exposure}\) + \(\text{Heat stress}_{lag1}\) + \(\text{Heat stress}_{lag2}\) + \(\text{Cyclone exposure}_{lag1}\) + \(\text{Cyclone exposure}_{lag2}\)
  • Model covariates (no lags) + random: \(\text{Coral cover}\) ~ 1 + \(\text{Heat stress}\) + \(\text{Cyclone exposure}\) + (1 | \(\text{reef}\))
  • Model covariates only (no lags): \(\text{Coral cover}\) ~ 1 + \(\text{Heat stress}\) + \(\text{Cyclone exposure}\)

Results show that Model full formulation produces the best model outputs based on the AIC values (Table 1).

Overall, the effect sizes of fixed effects are pretty similar between model formulation (Figure 16). Lagged covariates contribute to reduce the uncertainty associated with the estimation of disturbance impacts with a substantive negative effects of cyclone exposure detected in the models that include them. The inclusion of random effects at the reef level decreases the range of predicted values of coral cover (Figure 17) and associated uncertainty (Figure 18) at tier5 level. At tier4 level, lagged covariates reduce the magnitude of positive changes in coral cover between 2012-2018 and random effects reduce associated uncertainty (Figure 19).

Table 1: Values of AIC for each model formulation
Model AIC
Model full 3337.94
Model random only 3345.08
Model covariates only 3338.30
Model covariates (no lags) + random 3344.63
Model covariates only (no lags) 3351.49
Figure 16: Estimated values of disturbance effects for each model formulation.
Figure 17: Estimated predicted coral cover for each model formulation. Black dots correspond to the average values, black lines show 95% credible intervals and grey distributions are the distribution of values.
Figure 18: Estimated uncertainty for each model formulation. Uncertainty corresponds to the difference between the upper and lower values of the 95% credbile intervals associated with coral cover predictions. Black dots correspond to the average values, black lines show 95% credible intervals and grey distributions are the distribution of values.
Figure 19: Weigthed coral cover trajectory at tier4 level. The black line shows averaged area in coral cover and blue ribbons associated 95% credible intervals.
Table 2: Computing time (in min) for each model formulation.
Model time
Model full 16.71
Model random only 13.17
Model covariates only 12.88
Model covariates (no lags) + random 15.71
Model covariates only (no lags) 10.68

The leave-out data approach is used to evaluate the influence of specific observations using prediction-performance measures. The full model is fitted on a train dataset composed of a random sample of observations and prediction-performance measures are computed on the leave-out observations (Sainsbury-Dale, Zammit-Mangion, and Cressie 2021).

Five validation tests are developed:

  • 1 - rm(20% obs): 20% of data were randomly removed without structure.
  • 2 - rm(20% reef): 20% of reefs were randomly removed.
  • 3 - rm(20% site): 20% of sites were randomly removed.
  • 4 - rm(20% transect): 20% of transects were randomly removed.
  • 5 - rm(4YRS): 4 years of observations were removed within each location (locations with less than 4 years of data were not used).

Five predictive measures are used:

  • 95% coverage - evaluates the width of prediction intervals.
  • 95% interval score - rewards narrow prediction interval and penalizes instances when observed coral cover value misses the interval.
  • Root-mean-squared prediction error - assesses point-wise predictive performance.
  • Continuous Ranked Probability Score - evaluates the predictive cumulative distribution function of the mean process.
  • Akaike criterion - assesses how well the model fits the data set.
Table 3: Measures of performance for each test.
Test 95% coverage 95% interval score Root-mean-squared prediction error Continuous ranked probability score AIC
1 - rm(20% obs) 0.57 0.60 0.11 0.06 3323.16
2 - rm(20% reef) 1.00 0.37 0.11 0.10 2844.93
3 - rm(20% site) 0.50 0.86 0.14 0.07 3058.58
4 - rm(20% transect) 0.61 0.55 0.10 0.05 3257.73
5 - rm(4YRS) 0.64 0.49 0.11 0.06 3131.64
Table 4: Computing time (in min) for each test.
Test time
1 - rm(20% obs) 13.81
2 - rm(20% reef) 10.44
3 - rm(20% site) 14.75
4 - rm(20% transect) 16.41
5 - rm(4YRS) 11.81

The aim of this analysis is to explore the influence of the basis function formulation with a focus on the temporal dimension. To do this, we compare five model performance using different number of knots in the temporal basis function (Figure 20):

  • Model full: number and location of temporal knots automatically estimated from the FRK function and adopted by ReefCloud (Section 2.5).
  • Model 5K: five temporal knots randomly selected from the FRK auto basis function.
  • Model 3K: three temporal knots randomly selected from the FRK auto basis function.
  • Model 1K: one temporal knot randomly selected from the FRK auto basis function.
  • Model noK: no temporal basis function - spatial model only.

For each model, the spatial component is similar than model full (Figure 8). At the tier5 level, the inclusion of additional knots tends to smooth estimated trajectory of coral cover and associated uncertainty (Figure 21). Interestingly, this trend is not propagated at the tier4 level with a higher uncertainty estimated for the model full and lower for the model with one temporal knot and none (Figure 22). Also, we estimate different effect size of disturbances across the models with a positive (but not significant) effect of cyclone exposure for the model full, five and three knots and negative (but not significant) otherwise (Figure 23). Results show that Model full formulation produces the best model outputs based on the AIC values (Table 5).

Figure 20: Locations of the knots for each temporal component. The x axis corresponds to the years.
Figure 21: Model fits for a unique tier5 with different temporal components. The black lines show averaged predictions at the tier5 level and blue ribbons associated 95% credible intervals. The grey lines show observed coral cover trajectories at the transect level.
Figure 22: Weigthed coral cover trajectory at tier4 level. The black line shows averaged area in coral cover and blue ribbons associated 95% credible intervals.
Figure 23: Estimated values of disturbance effects.
Table 5: AIC table.
Model name AIC
Model full 3338.33
Model 5K 3381.73
Model 3K 3399.61
Model 1K 3519.24
Model noK 3505.24
Table 6: Computing time (in min) for each test.
Model time
Model full 26.13
Model 5K 5.45
Model 3K 3.54
Model 1K 1.83
Model noK 0.26
Figure 24: Model residuals diagnostics.

References

Burke, Lauretta, Katie Reytar, Mark Spalding, and Allison Perry. 2011. Reefs at Risk Revisited. Reefs at Risk Revisited.
Emslie, Michael J., Peran Bray, Alistair J. Cheal, Kerryn A. Johns, Kate Osborne, Tane Sinclair-Taylor, and Cassandra A. Thompson. 2020. “Decades of Monitoring Have Informed the Stewardship and Ecological Understanding of Australias Great Barrier Reef.” Biological Conservation 252 (December): 108854. https://doi.org/10.1016/j.biocon.2020.108854.
Hartig, Florian. 2023. “DHARMa: Residual Diagnostics for Hierarchical (Multi-Level/Mixed) Regression Models.” https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html.
Kay, Matthew. 2023. ggdist: Visualizations of Distributions and Uncertainty. https://doi.org/10.5281/zenodo.3879620.
Liu, Gang, C. Mark Eakin, Mingyue Chen, Arun Kumar, Jacqueline L. De La Cour, Scott F. Heron, Erick F. Geiger, William J. Skirving, Kyle V. Tirak, and Alan E. Strong. 2018. “Predicting Heat Stress to Inform Reef Management: NOAA Coral Reef Watchs 4-Month Coral Bleaching Outlook.” Frontiers in Marine Science. https://doi.org/10.3389/fmars.2018.00057.
Puotinen, Marji, Jeffrey A. Maynard, Roger Beeden, Ben Radford, and Gareth J. Williams. 2016. “A Robust Operational Model for Predicting Where Tropical Cyclone Waves Damage Coral Reefs.” Scientific Reports 6 (1). https://doi.org/10.1038/srep26009.
Sahr, Kevin. 2008. “Location Coding on Icosahedral Aperture 3 Hexagon Discrete Global Grids.” Computers, Environment and Urban Systems 32 (3): 174–87. https://doi.org/10.1016/j.compenvurbsys.2007.11.005.
Sainsbury-Dale, Matthew, Andrew Zammit-Mangion, and Noel Cressie. 2021. “Modelling Big, Heterogeneous, Non-Gaussian Spatial and Spatio-Temporal Data Using FRK.” arXiv Preprint arXiv:2110.02507.
Spalding, Mark D., Helen E. Fox, Gerald R. Allen, Nick Davidson, Zach A. Ferdaña, Max Finlayson, Benjamin S. Halpern, et al. 2007. “Marine Ecoregions of the World: A Bioregionalization of Coastal and Shelf Areas.” BioScience 57 (7): 573–83. https://doi.org/10.1641/b570707.
Zammit-Mangion, Andrew, and Noel Cressie. 2021. “FRK: An r Package for Spatial and Spatio-Temporal Prediction with Large Datasets.” Journal of Statistical Software 98 (4). https://doi.org/10.18637/jss.v098.i04.